
# The Impact of the U.S. Census Disclosure Avoidance System on Redistricting and Voting Rights Analysis

Christopher T. Kenny, Shiro Kuriwaki, Cory McCartan, Evan Rosenman, Tyler Simko, Kosuke Imai 

[![](<https://img.shields.io/badge/Dataverse DOI-10.7910/DVN/TNNSXG-orange>)](https://www.doi.org/10.7910/DVN/TNNSXG)
[![arXiv](https://img.shields.io/badge/arXiv-2105.14197-66a61e.svg)](https://arxiv.org/abs/2105.14197)


This repository contains replication code and data for the above paper.

## Directories

When downloading files from Dataverse, be sure to download the entire set of files preserving the subdirectory structure. This way, the zip file will come with the following subdirectories:

- `data/`: Contains source and intermediate data for replication and further analysis
- `R/`: Contains replication code, to be run in numbered order. The directory also includes the scripts that build the data from online Census resources.
- `figs/`: Contains produced paper figures
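
A quick check that the download preserved this layout can save a failed run later. The sketch below uses only base R. `check_layout()` is a hypothetical helper, not part of the replication code, and it assumes the working directory is the root of the unzipped package.

```r
# Hypothetical helper (not part of the replication scripts): report which
# of the expected top-level subdirectories exist under `root`.
check_layout <- function(root = ".") {
  expected <- c("data", "R", "figs")
  present <- dir.exists(file.path(root, expected))
  names(present) <- expected
  present
}

# Assumes the current working directory is the package root.
layout <- check_layout(".")
if (!all(layout)) {
  message("Missing subdirectories: ",
          paste(names(layout)[!layout], collapse = ", "))
}
```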


## Data Sources

The PPMF files come from the Census demonstration page. Because each file is over 15 GB, we do not duplicate them in the replication package. Instead, download the data to your local machine with the script `01_download_setup.R`, which uses the `ppmf` package <https://github.com/christopherkenny/ppmf> to download the online resources to the `data-raw` subdirectory.

All state-specific data listed below as coming from VEST is provided unchanged in folders named `[state abbreviation]_[election year]` (e.g., `al_2018/`). This data is copyrighted by the Voting and Election Science Team under a Creative Commons Attribution license (CC BY 4.0, <https://creativecommons.org/licenses/by/4.0/>).
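
The folder naming pattern can be written out as a small base R sketch. `vest_dir()` is a hypothetical helper introduced here for illustration; it assumes the `data/XX/xx_YYYY` layout used by most states in the listings below. South Carolina stores its VEST files in `data/SC/VEST` instead, so the pattern does not apply to every state.

```r
# Hypothetical helper: build the path of a VEST folder from a state
# abbreviation and an election year, following the pattern
# data/[STATE]/[state abbreviation]_[election year].
vest_dir <- function(state, year, root = "data") {
  file.path(root, toupper(state), paste0(tolower(state), "_", year))
}

vest_dir("al", 2018)  # "data/AL/al_2018"
```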


Alabama

- `data/AL/al.Rds`: Final dataset processed from VEST 2018 data by adding Census data using `geomander`
- `data/AL/al_2018/`: Spatial data on 2018 elections from VEST downloaded from <https://doi.org/10.7910/DVN/UBKYRU>.

Delaware

- `data/DE/de.Rds`: Final dataset processed from VEST 2020 data by adding Census data using `geomander`
- `data/DE/de_2020/`: Spatial data on 2020 elections from VEST downloaded from <https://doi.org/10.7910/DVN/UBKYRU>.

East Ramapo Central School District

- NYS Voterfile (as of 11/16/2021) geocoded with `censusxy`
- 2010 Census block-level data built using `geomander`


Louisiana

- `data/LA/la.Rds`: Final dataset at the 2010 Census voting tabulation district level processed from VEST 2018 precinct data by adding Census data using `geomander`
- `data/LA/la_2018/`: Spatial data on 2018 elections from VEST downloaded from <https://doi.org/10.7910/DVN/UBKYRU>.


Mississippi

- `data/MS/ms.Rds`: Final dataset built from official Mississippi data from the Mississippi Automated Resource Information System
- `data/MS/precincts_2010/`: Precinct shapes for Mississippi from the Mississippi Automated Resource Information System <https://www.maris.state.ms.us/HTML/DATA/Political.html#gsc.tab=0>
- `data/MS/Sen_2010_TRP1`: State senate shapes for Mississippi from the Mississippi Automated Resource Information System <https://www.maris.state.ms.us/HTML/DATA/Political.html#gsc.tab=0>


North Carolina (redistricting simulations)

- `data/NC/nc_shp.rds`: Final dataset processed from the files below. On 2022-07-01, we corrected an omission in the dataset by adding statistics under DAS-19.61 (with prefix "v19_"). These are necessary to replicate Script 03B.
- `data/NC/nc_adj.rds`: An adjacency file that is a slight modification of the `redist.adjacency()` output of `nc_shp.rds`. We have carefully adjusted a handful of edge cases to connect the counties properly.
- `data/NC/nc_shp_initial.rds`: Official shapefile from the state legislature. Same file as <https://github.com/alarm-redist/redist-data/blob/main/data/nc.rds>. 
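
To illustrate what "connected properly" means for an adjacency file, here is a minimal base R sketch of a connectivity check. `is_connected()` is a hypothetical helper, not the procedure used to build `nc_adj.rds`. It assumes the adjacency is a list in which element `i` holds the zero-indexed neighbors of unit `i`.

```r
# Hypothetical helper: TRUE if every unit can be reached from the first one.
# Assumption: adj[[i]] holds the zero-indexed neighbors of unit i.
is_connected <- function(adj) {
  n <- length(adj)
  if (n == 0) return(TRUE)
  visited <- logical(n)
  visited[1] <- TRUE
  queue <- 1L
  # Breadth-first search starting from the first unit
  while (length(queue) > 0) {
    current <- queue[1]
    queue <- queue[-1]
    neighbors <- as.integer(adj[[current]]) + 1L  # shift to one-based indexing
    unseen <- unique(neighbors[!visited[neighbors]])
    visited[unseen] <- TRUE
    queue <- c(queue, unseen)
  }
  all(visited)
}

# Toy example: units 0, 1, 2 form a chain and unit 3 is isolated
toy <- list(1L, c(0L, 2L), 1L, integer(0))
is_connected(toy)  # FALSE
```

If `nc_adj.rds` follows the list format assumed above, the same check could be applied to the object returned by `readRDS("data/NC/nc_adj.rds")`.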


North Carolina (Bayesian Improved Surname Geocoding)

- Replication data are available at the public repository <https://github.com/kosukeimai/wru-data>


Pennsylvania

- `data/PA/pa_shp.rds`: Pennsylvania shapefile with election results, processed
  from <https://dataverse.harvard.edu/dataset.xhtml?persistentId=hdl:1902.1/16389>
- `data/PA/pa_plans.rds`: Precinct-level redistricting plan assignments, processed
  from <https://perma.cc/6ZVH-WSRW>


South Carolina

- `data/SC/sc_shp.rds`: Final dataset processed from the files below. `sc_adj.rds` contains the adjacency.
- Election data (2018) from VEST <https://doi.org/10.7910/DVN/NH5S2I> (downloaded in `data/SC/VEST`). Processed with Census data from `tigris` and `geomander`.
- `01_prep_SC_data.R` contains code to create the final data


Utah

- `data/UT/ut.Rds`: Final dataset processed from VEST 2018 data by adding Census data using `geomander`
- `data/UT/ut_2018/`: Spatial data on 2018 elections from VEST downloaded from <https://doi.org/10.7910/DVN/UBKYRU>.


Washington

- `data/WA/wa.Rds`: Final dataset processed from VEST 2018 data by adding Census data using `geomander`
- `data/WA/wa_2018/`: Spatial data on 2018 elections from VEST downloaded from <https://doi.org/10.7910/DVN/UBKYRU>.
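
The final datasets are stored as `.Rds` files, which base R reads with `readRDS()`. The sketch below is a convenience wrapper under stated assumptions: `read_state()` is a hypothetical helper, and it only covers the states whose final file follows the `data/XX/xx.Rds` pattern listed above (AL, DE, LA, MS, UT, WA).

```r
# Hypothetical helper: read the final dataset for a state whose file
# follows the data/XX/xx.Rds pattern.
read_state <- function(state, root = "data") {
  path <- file.path(root, toupper(state), paste0(tolower(state), ".Rds"))
  if (!file.exists(path)) {
    stop("File not found: ", path)
  }
  readRDS(path)
}

# Example (requires the replication data to be present):
# wa <- read_state("wa")
```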
